Comparative Study of Clustering Algorithms using Overall SimSUX Similarity Function for XML Documents
Similar resources
An Overview of Similarity Measures for Clustering XML Documents
The large amount and heterogeneity of XML documents on the Web require the development of clustering techniques to group together similar documents. Documents can be grouped together according to their content, their structure, and links inside and among documents. For instance, grouping together documents with similar structures has interesting applications in the context of information extrac...
Similarity Metric for XML Documents
Since XML documents can be represented as trees, this paper builds on the traditional tree edit distance to present a structural similarity metric for XML documents, which is based on edge constraint, path constraint, and inclusive path constraint, and a similarity metric based on machine learning with node costs. It extends the scope for searching XML documents, and improves recall and precision for searchi...
A Comparative Study of Some Clustering Algorithms on Shape Data
Recently, some statistical studies have been done using the shape data. One of these studies is clustering shape data, which is the main topic of this paper. We are going to study some clustering algorithms on shape data and then introduce the best algorithm based on accuracy, speed, and scalability criteria. In addition, we propose a method for representing the shape data that facilitates and ...
Clustering XML Documents Using Structural Summaries
This work presents a methodology for grouping structurally similar XML documents using clustering algorithms. Modeling XML documents with tree-like structures, we face the ‘clustering XML documents by structure’ problem as a ‘tree clustering’ problem, exploiting distances that estimate the similarity between those trees in terms of the hierarchical relationships of their nodes. We suggest the u...
Measuring Similarity between XML Documents
With the advance of World Wide Web standards, XML documents have become popular in e-business applications for information exchange. Electronic catalogs and transaction records are now formatted in XML. XML documents are semi-structured documents with XML schemas marking up the semantics. XML separates presentation from semantics so that presentation of information on different devices can be proces...
Journal
Journal title: INTELIGENCIA ARTIFICIAL
Year: 2015
ISSN: 1988-3064
DOI: 10.4114/ia.v18i55.1097